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Abstract 

In this paper we briefly describe a combined sym- 
bolic and numeric approach for solving mathemat- 
ical models on parallel computers. An experimen- 
tal software system, PIER, is being developed in 
Common Lisp to synthesize computationally in- 
tensive and domain formulation dependent phases 
of FEA solution method. Quantities for domain 
formulation like shape functions, element stiffness 
matrices etc. are automatically derived using sym- 
bolic mathematical computations. The problem 
specific information and derived formulae are then 
used to generate (parallel) numerical code for FEA 
solution steps. A constructive approach to spec- 
ify a numerical program design is taken. The 
code generator compiles application oriented in- 
put specifications into (parallel) f77 routines with 
the help of built-in knowledge of the particular 
problem, numerical solution methods and the tar- 
get computer. 

Introduction 

Engineers and scientists frequently encounter mathe- 
matical models based upon partial differential equa- 
tions (PDEs) in a wide variety of applications. Finite 
element analysis (FEA) (Zienkiewicz 1980) is a ma- 
jor computational tool for the numerical solution of 
boundary and initial value problems that arise in stress 
analysis, heat transfer and continuum mechanics of all 
kinds. The problem domain is first discretized into a 
suitable mesh of elements . Then well-selected analyt- 
ical approximations are used for solution within each 
element. The global solution for all discrete points (el- 
ement nodes ) of the mesh is computed by numerical it- 
erations taking into account inter-element interactions 
and boundary conditions. 

Simple FEA applications can be performed with 
canned packages such as NFAP (Chang 1980) and 
NASTRAN. Situations involving complicated bound- 
ary conditions or element properties, non-linear ma- 
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terial properties, require customizing many aspects of 
FEA. In such cases, the finite element solution pro- 
cess consists of a symbolic computation phase followed 
by a numerical computation phase. Depending on the 
problem at hand, the symbolic computation phase may 
involve construction and analysis of solution approxi- 
mations } simplification of large analytical expressions , 
changing variables and/or coordinates to simplify the 
problem } operating on matrices and tensors with sym- 
bolic entries , as well as integration and differentiation 
of analytical expressions . Results of the symbolic com- 
putation phase are then used to construct numerical 
programs. 

Frequently the mathematical models and related 
computer programs are revised during research, engi- 
neering and production. Numerical convergence prob- 
lems may also require that a different numerical proce- 
dure be used for FEA solution steps. When the models 
are three-dimensional, or use large data sets, the pro- 
gram execution speed is critical. Writing programs for 
parallel computers to speed-up execution is indeed not 
a trivial task for modelers. Also, parallel programs 
written for a parallel machines can not be ported to 
other machines without significant re-programming ef- 
fort. State of the art parallelizing compilers (Kuck 
1978), (Allen Ic Kennedy 1985) take an existing (se- 
quential) code as input and can produce programs for 
the target parallel machine. However, these compil- 
ers parallelize scientific and engineering applications on 
the model of linear algebra and either completely ig- 
nore the domain specific parallelism naturally present 
in the problem or query the user during compilation. 

In recent years, there has been an increase in re- 
search and development efforts to alleviate these prob- 
lems. Existing approaches combine symbolic and nu- 
merical computing in various ways. These (coupled) 
symbolic- numeric systems generally take the user input 
in a very high-level form and automatically generate 
numerical code in a procedural programming language 
like ill or C for the target computer. Some notable re- 
cent projects are Ellpack (Rice, Boisvert, and Ronald 
1985), Sinapse (Kant et al. 1990), Alpal (Cook 1990), 
PPEQSOL (Hirayama, Ikeda, and Sagawa, 1991), (Pe- 
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skin 1987), and (Steinberg and Roache 1990). Many 
of these projects have adopted finite difference solution 
method for PDEs. 

Our Approach 

We have been working for a number of years (Wang 
1986), (Sharma and Wang 1988a), (Sharma 1988b), 
(Sharma and Wang 1990), (Sharma 1991a) in this re- 
search direction and our primary PDE solution method 
is FEA. We identify key solution steps of FEA which 
are compute-intensive and are reprogrammed ev- 
ery time new element formulations or boundary con- 
ditions are used. Our approach is to employ symbolic 
computation to generate sequential and parallel nu- 
merical codes for the key FEA solution steps. The 
code is generated in (the parallel version of) 177 on the 
target parallel computers (currently include Sequent 
Balance shared memory and distributed-memory In- 
tel iPSC/860). Based on the user input, quanti- 
ties such as element shape functions and strain- 
displacement matrices can be derived using symbolic 
mathematical computations. The derived formulas are 
used to generate numerical code for computing element 
stiffness matrix, solution of system of equations and 
other solution steps. The generated code can be read- 
ily combined with existing FEA codes. The overall 
scheme is pictorially depicted in Fig. 1. We are de- 



Figure 1: Overview of Approach 

veloping a new FEA code generator named PIER to 
build upon our previous work in this area and to break 
new grounds. PIER is Common Lisp (CL)-based and 
can work directly with the CL-based MAXIMA. It can 
be easily ported to other CL-based symbolic computing 
systems. PIER generates sequential and parallel codes 
for the key solution steps. In next two sections we 
briefly discuss our design objectives and PIER’s pro- 
gramming knowledge. PIER input specifications and 


the code generation scheme are overviewed in subse- 
quent sections. We conclude the paper by outlining 
some relevant issues. 

Design Goals 

Previous FEA code generators, such as FINGER, P- 
FINGER and PDEQSOL, are equipped with a fixed 
number of numerical algorithms for FEA solution steps 
and the algorithms are parallelized and implemented 
for one specific parallel computer. Porting the code 
generator to other machines, thus requires work from 
scratch. This is a major drawback which should be 
overcome. Our first design requirement addresses this 
issue. 

The code generator provides a set of architecture- 
independent input specifications to design and ex- 
press numerical algorithms used in FEA solution 
steps. 

In generating parallel programs from the user in- 
put specifications, it is possible to take advan- 
tage of domain independent parallelism which exists 
among concurrently schedulable code modules and do- 
main specific parallelism such as carrying out FEA 
(sub)computations in the element-by-element (Winget 
and Hughes 1985) formulation as opposed to the as- 
sembled formulation, or substructuring the FE mesh 
etc. This leads to our second design requirement. 

Automatically generates good implementation 
mappings (for input specifications) to modern 
high-performance computers with the help of 
built-in knowledge of the application domain, 
FEA solution method, and the target program- 
ming environment. 

These design requirements allow engineers and sci- 
entists to customize the FEA solution process for the 
desired application area and the problem instances. 
Only input specifications needs to be altered without 
worrying about implementation details or the target 
architecture. 

System Overview 

PIER provides a knowledge-based programming envi- 
ronment to the modelers. The architecture of the en- 
vironment comprises following components. 

1. A Programming Knowledge-Base. 

2. A set of User Input Specifications. 

3. Code Generator. 

The programming knowledge-base provides generic 
Operations (a set of basic linear algebra computations 
including matrix-vector product, vector inner prod- 
uct, solving triangular system of equations etc.) and 
domain specific Operations (a set of basic finite ele- 
ment analysis computations including assembling ele- 
ment stiffness matrices, deriving shape functions, vec- 
tor preconditioning etc.). A PIER Operation has four 
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parts: prologue/epilogue , a set of algorithm schemas, 
control dependence graph (CDG) and the associated 
cost-model The schemas are stated as templates writ- 
ten in Common Lisp, which include assignments, con- 
ventional control constructs, and array /scalar compu- 
tations. The CDG represents the execution depen- 
dence among several sub-computations in the Opera- 
tion and the cost-model determines the execution cost 
(computation and communication costs) of the Oper- 
ation. PIER Operations implement the intended com- 
putations in one of the following execution styles: 

(51) Assembled: Execute for assembled data. 

(52) FullyParallel: Execute for individual element 

data concurrently. 

(53) BlockParallel : Execute for a block 1 of element 

data. 

(54) Scalar: Execute for one element data at a time. 

The user input specifications provide methods to 
specify problem parameters, desired symbolic deriva- 
tion and combine Operations to construct an FEA al- 
gorithm. The code generator generates ill programs 
from the user input specifications for the target archi- 
tecture. The generated code is compiled and linked on 
the target machine and executed. The programming 
knowledge-base also provides completed specifications 
for frequently used FEA algorithms. The user can, 
however, specify a new algorithm and add the same to 
the knowledge-base. The knowledge about program- 
ming the target parallel architecture is represented as 
a set of transformations. These transformations con- 
vert CDGs into equivalent ill templates. Porting to 
other computers, thus, require developing the set of 
relevant transformations. 

PIER Input Specifications 

One of the major research objectives in PIER is to de- 
sign a set of very high level input specifications which 
are used by scientists/engineers as well as system de- 
velopers to describe FEA computations and problem 
instances. We advocate a bilingual programming style 
in which application oriented specifications (i.e. termi- 
nology and notations as used in standard FEA texts) 
can be mixed with regular ill syntax to express an 
FEA algorithm. In designing the PIER input specifica- 
tions we seek that the specifications should be easy to 
understand and easy to produce by scientists/engineers 
and the user specifies only the functionality desired and 
leaves the implementation details to PIER. 

The overall approach is to add statements (which 
represent domain-specific computations) to ill . The 
set of powerful statements are primarily intended for 
FEA algorithms. Although this scheme could easily 
work in other areas of scientific computing. The in- 
put specifications support the definition of the element 

1 A block is a set of elements in which no two elements 

share a node 


mesh, nodal properties, various data arrays, symbolic 
derivation and specification of numerical algorithms 
for the solution procedures. Statements defining stor- 
age strategies for FEA data arrays, high-level sym- 
bolic/numerical computations (PIER Operations), and 
straight-line 2 sequences of Operations (PIER Modules ) 
can be intermixed with regular ill constructs to spec- 
ify a desired numerical algorithm. We now describe 
the underlying programming model. 

The Programming Model 

In general the systematic software development process 
begins with informal requirement specifications. This 
is followed by one or more than one design phases, 
which define a system structure meeting requirement 
specifications. The design phase identifies software 
modules and their organization. The text book style 
description of numerical algorithms can be expressed 
at this level of abstraction with relative ease and PIER 
automates rest of the software development phases i.e. 
detail design and implementation for the algorithm. 
While generating parallel code, the user also specifies 
the resource constraints (i.e. number of maximum pro- 
cess/processors etc.). The input specifications are hier- 
archical and the user expresses numerical algorithms in 
a bottom-up fashion by creating abstractions of higher 
level in terms of lower ones. This is depicted in the 
Fig. 2. Let us describe each level briefly. 



Figure 2: Hierarchy of Computations 

• Operation: An Operation is the smallest unit of 
computation (provided in PIER knowledge-base). 
An operation usually represents a single textbook 
equation with one variable on the left hand side and 
an expression involving one or more variables on the 
right hand side. For example, the equations 

Tempi = r • z 

2 No sequence control is involved 
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u — a • p 
Temp?, — p > u 

involved in the PCG (Preconditioned Conjugate 
Gradient) algorithm (Hughes, Ferencz, and Hallquy- 
ist 1987) can each be specified by an Operation. An 
Operation can specify either a symbolic derivation 
or a numeric computation. For an operation, the 
variable on the left-hand side is its output data ob- 
ject, while those on the right-hand side sire its input 
data object A PIER Dataobject specifies numerical 
values associated with elements/nodes organized in 
a structured fashion (e.g. matrix, vector). 

• Module: A Module consists of a sequence of Opera- 
tions with no entry or exit points except at the begin- 
ning and at the end of the module. In other words, 
control flow enters at the beginning and leaves at 
the end of a Module, Fig. 3 illustrates Module 
specifications. In the example, Module, In, Out, 
Begin , End are keywords whereas VecInnerVec and 
MatTimesVec represent Operations (for numerical 
computations). 


Each step can be solved by more than one (sym- 
bolic/numerical) procedure and expects a fixed set 
of input quantities and computes a fixed set of re- 
sults. Depending on the problem formulation, the 
algorithm used for a solution step may be different. 
Some standard algorithms such as Gauss quadra- 
ture, Gaussian elimination and preconditioned con- 
jugate gradient are built into PIER. Others can be 
supplied by the user through PIER input specifica- 
tions. 

To derive/generate desired FEA computations (sym- 
bolic formulae/f 77 code) the user must first specify the 
element mesh, the element properties, the data arrays 
for material matrix and nodal coordinates. This is fol- 
lowed by the specifications to derive desired element 
formulae. We now give an example (Fig. 5) where 
the problem domain is divided into 256 linear trian- 
gular elements. Total number of nodes in the mesh is 
153. The local degrees of freedom at each node is 1. 
For complete syntax and detailed examples for PIER 
input specifications the reader is referred to (Sharma 
and Wang 1991b). 


Module CSSA, (In : (a,r,z ,p) ,Out : (Tempi ,Temp2) 
Begin 

Tempi = VecInnerVec(r ,z) 
u = MatTimesVec (a, p) 

Temp2 = VecInnerVec(p,u) 

End 


Figure 3: PIER Module Specification Example 

• Algorithm: An Algorithm is specified by combin- 
ing Modules with til constructs. PIER also sup- 
plies, from its knowledge-base, certain standard al- 
gorithms that can be used directly. Fig. 4 illustrates 
an example of Algorithm specifications. 


PIER Algorithm Specification Example 

Algorithm :Foo, (In: (a,b) ,Out : (x) ) 

Begin 

. . . 177 code . . , 

C Module Call. 

« (Temp 1 , Temp2 ) =Module ( CSSA , a , r , z , p) >> 
. . . f77 code . . . 


Figure 4: PIER Algorithm Specification Example 

• FEA Solution Step: The breakup of the FEA 

solution process recognizes eight solution steps. 


C Defining Triangular Element Mesh, 
m = Mesh(Dim:2, Nodes : 153, Elements :2S6) 
e = Element (Ldim: 1 , Nodes : 3, Shape: Triangle) 
Dataobject x,Name :XNodalCoordinate 
Dataobject y ,Name : YNodalCoordinate 
Dataob j ect enm , Name : ElementNodalMatr ix 
Dataobject m, Name : Mat Max 

C Deriving element approximations. 
h=Der iveShape (Algorithm : Polynomial , e ) 
b=DeriveBMatrix( Algorithm : Displacement , d ,h) 

C Generating Numerical Code for a FEA Step, 
(x) =SolveSy st em( Algorithm : Pcg,k ,r,File :foo) 


Figure 5: PIER Input Specification Example 

Synthesis Process 

In PIER the FEA programs are synthesized by the 
method of composition of program components. The 
Operations (in PIER knowledge-base), Modules (User- 
defined) and Algorithms (User-defined) represent pro- 
gram components in the increasing order of hierarchy. 
The PIER code generator incrementally refines input- 
specifications into FEA programs. The code genera- 
tion is overviewed in Fig. 7 and consists of following 
phases 

1. Parsing Input Specifications 

2. Problem Definition 

3. Code Generation 
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In the first phase various input specification con- 
structs are identified and translated into PIER in- 
ternal Common Lisp function calls. The Algorithm 
template is recognized and preserved. The func- 
tions related to the problem definition (parameters of 
FEA mesh/element and symbolic derivation of element 
properties) are executed first. This assigns appropri- 
ate values to control variables in the environment. The 
first phase also identifies and analyses PIER Modules, 
Module Calls and input/output data object specifi- 
cations. The user-specified element approximations 
are derived using symbolic mathematical computations 
(using AKCL-MAXIMA) and the MAXIMA internal 
representations are translated into equivalent Common 
Lisp expressions. 

The code generation phase generates code for each 
Module and constituting Operations. Each Module 
Call in the Algorithm specification generates code for 
a Module. Symbolic expressions, if any, appearing in 
the Module body are translated into equivalent Op- 
eration specifications. The code for a Module is gen- 
erated as a set of 177 subroutine calls and the corre- 
sponding subroutines. After generating code for all of 
the Modules referred to by Module Calls, the Gencray 
(Weerawarana and Wang 1989) translator is called to 
translate the generated Common Lisp forms into equiv- 
alent 177 statements and the holes in the Algorithm 
template are filled appropriately. In the following sub- 
sections we describe code generation from Module and 
Operation followed by an overview of problem solving 
with PIER. 

Module Code Generation 

The code generation from PIER Modules is modeled 
by Qowgraphs. A flowgraph is a collection of flownodes , 
which represent task instances and directed edges, 
which represent data dependencies among flownodes. 
The code generator derives the flowgraph representa- 
tion from the sequence of Operations specified in the 
Module body and schedules the flowgraph onto the tar- 
get architecture. Operations and data objects form 
flownodes and edges of the flowgraph respectively. The 
flownodes, thus, represent coarse grained tasks which 
have unique cost-models and may be assigned different 
execution styles. 

The schedular of the code generator takes as input a 
flowgraph, a processor count, and Module execution 
style (optional). The schedular assigns appropriate 
number of processors and an execution style to each 
Operation. The execution style must be consistent 
with the input and output data objects of the Opera- 
tion. The schedule should conform to the cost-models 
of Operations and respect the partial order represented 
by the flowgraph. The overall objective is to produce 
a schedule with the lowest total cost. 

Detaijs of the parallel code generation can be found 
in (Sharma and Wang 1990) and (Sharma 1992). 


Operation Code Generation 

Many of PIER Operations involve regular computa- 
tions and are internally parallelized in the data par- 
allel fashion. Execution styles ( BlockParallel , Assem- 
bled etc.) refers to methods of partitioning the in- 
put/output data objects. The user-specified quanti- 
ties and output of the schedular are used to refine 
the algorithm schemas, which implements the Oper- 
ations. PIER accepts input data objects organized in 
various specialized storage strategies (e.g. Symmetric 
Matrix, Banded Matrix etc.). The appropriate data 
reference mapping are automatically generated in the 
output code. 

Problem Solving with PIER 

To use PIER in practice, the first step is to prepare a 
mathematical model describing the physical situation. 
The modeler, then, prepares the weak statement fol- 
lowed by dividing the problem domain in a series of 
elements. Here, we are not concerned with the dis- 
cretization process and assume that one of the sev- 
eral available domain decomposition software tools is 
used. However, domain discretization data (i.e. ele- 
ment type, nodal coordinates, list of nodes associated 
with each element) are to be organized in an appropri- 
ate fashion for PIER consumption. To derive compu- 
tations for any FEA solution steps, the modeler must 
first define the element mesh. This is followed by input 
specifications for FEA solution steps which include de- 
sired quantities/methods for symbolic derivation and 
numerical computation. If the desired numerical algo- 
rithm is not part of the PIER’s knowledge-base, the 
complete algorithm has to be expressed in input speci- 
fications. The modelers can use PIER to generate ill 
code for FEA solution steps. The generated code, if 
desired, can be executed in conjunction with an ex- 
isting FEA package. The process is outlined in Fig. 
6 . 

Issues 

As indicated earlier, in the present work we are focus- 
ing on two issues, that we consider critical, in FEA 
code generation: programmable code generator and 
code generation for multiple parallel architectures. 

Progammable Code Generator 

FEA solution method involves symbolic mathemat- 
ical manipulation and numerical computation with 
large data sets. To solve a FEA solution step the mod- 
elers make choices for the domain mesh, element ap- 
proximations, and numerical solution algorithm. The 
choices made are based on: the characteristics of the 
posed model, target (parallel) computer, and numer- 
ical convergence properties. Therefore the FEA solu- 
tion programs are highly specialized. Code generation 
systems which would cover all possible cases are bound 
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Figure 6: Problem Solving with PIER 


to be large, difficult to maintain, and slow. Our ap- 
proach is to identify and create library of Operations 
(and possibly Modules), which can generate program 
components for specific situations. These components 
are reusable among FEA solution procedures. A so- 
lution algorithm can be fabricated using library com- 
ponents and non-FEA specifications in standard 177. 
The users can customize the generators to their spe- 
cific needs. 

Parallel Code Generation 

A major open issue in parallel code generation is 
the modeling of architecture and the representation 
of machine specific parallel programming knowledge. 
In PIER, the computations are represented in an ar- 
chitecture independent formulation (flowgraphs). The 


Operations generate instances of flowgraphs and attach 
appropriate code segments to the flownodes. The flow- 
graph schedular is machine-specific and is the back-end 
of PIER. Parallel programming rules are transforma- 
tions from flowgraphs representations to equivalent 177 
templates. 



Generated 
Code 

Figure 7: Code Generation Scheme 
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